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ABSTRACT 

A new, very fast method for 3D radiative transfer on fully threaded grids with arbi- 
trarily high angular resolution is presented. The method uses completely cell-based 
discretization, and is ideally suited for problems with diffuse background radiation, 
often encountered in cosmological and star formation models. We find that for accu- 
rate statistical study of intergalactic Lya absorption lines one needs of order of few 
hundred angular discretization elements even for models without radiative feedback 
from star forming galaxies. 
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1 INTRODUCTION 

There are many astrophysical systems which require so- 
lution of the 3D radiative transfer equation. In the last 
decade, start ing perhaps from the loc al optical depth ap- 
proximation iGnedin fc Ostrikerlll997l) . numerical transfer 
solvers started to appear in cosmological and galaxy for- 
mation simulations. A wide range of approximations have 
been used to deal with the multi-dimensional nature of the 
transfer equation. Among the most interesting original de- 
velopments one can mention an implici t variable Edding- 
ton tensor method iNorman et alJll99Slh a massively par- 
allel multiple wavefront imp lementation by the Tsukuba 
group tUrnenrurj^^t,^!!^^!) . long characteristics on a uni- 
form grid ( Razoumov fc Scottll99g|) . Monte Carlo transport 
iCiardi et al.ll2001 ), optically thin v ariable Eddington ten- 
sor approach iGnedin fc Abelll200lD. adaptive ray tracing 
around point sources J Abel fc Wandeldl2002lh nested trees 
of rays and sources IIRazoumov et all2002i) , and the int egral 
transfer method via fast Fourier transforms JCenl2002ll . De- 
pending on the problem, many of these methods effectively 
reduced the number of operations for single-frequency trans- 
port from 0(N 4 N BTC ), where N is the number of data points 
in each spatial dimension, and A src is the number of sources, 
to scaling whi ch is closer to, but not t he same as 0(N 3 ). 

Recently, lJuvela fc Padoar] feOOot) proposed a new ra- 
diative transfer method for line emission on multi-resolution 
grids using a combination of long and short characteristics. 
In their implementation the radiative transfer equation is 



* e-mail: razoumov@phy.ornl.gov 
f e-mail: ccardall@mail.phy.ornl.gov 



solved on separate grids, and intensities ar e interpolated 
at the grid boundaries. On the other hand, ICardall et alJ 
( 2005) developed a method to solve the Boltzmann equa- 
tion for neutrino transport on fully threaded grids. The 
fully threade d data mod e l pion eered in astrophysical hydro- 
dynamics by iKhokhlovl il998l) is very efficient in dealing 
with multi-resolution phe nomena. In this paper w e extend 
the ray-tracing scheme of lJuvela fc Padoanl j2005l) to fully 
threaded structures moving from ray-based to cell-based 
discretization, and find that the resulting algorithm is ex- 
tremely fast and does not require any significant additional 
memory allocation on top of a hydrodynamical solver. 



We will show that in general, in order to recover the 
proper spatial distribution of the mean intensity in cosmo- 
logical models, one cannot (and does not necessarily want 
to) achieve a scaling much better than 0(N 3 N ang i es ). 



We limit transport to plane-parallel fluxes which enter 
the computational volume at arbitrary angles, assuming for 
now that there are no point sources of radiation in the vol- 
ume. Extended sources, i.e. those covering at least few dozen 
cells can be accurately modeled with plane-parallel trans- 
port with a sufficient number of angles (current scheme), 
whereas an unresolved star forming region would need a 
separate transfer scheme around point sources on refined 
meshes which we have also developed and will present sep- 
arately. Therefore, in this paper we limit our tests to a cos- 
mological model with no feedback. 
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2 METHOD 

2.1 2D algorithm 

We start with a 2D example to demonstrate the basic ideas 
of fully threaded transport, and extend the scheme to 3D 
in the next subsection. Consider a 2D Cartesian grid of size 
N x x N y , each element of which can be refined recursively by 
a factor of two in each dimension, resulting in four subcells 
within a refined element (Fig. 0. We tag individual cells 
with a variable-length label according to their position in 
the tree hierarchy. 

Assume that a plane-parallel wavefront enters the grid 
in some direction 9, which we restrict to 7r/2 < 9 < n. Let us 
discretize the wavefront with N x long characteristics (thick 
lines in Fig.0 separated in x by the base grid resolution Ax, 
so that each characteristic starts in its own base grid cell at 
the bottom of the computational grid (cell 1, parent cell of 
21-24, and parent cell of 31-34 in our example), and crosses 
the entire grid. Let us now divide each long characteristic 
into segments falling into individual cells, and consider those 
ray segments to be elements of their parent cells. Each ray 
segment can be described by several attributes such as the 
geometric length and the optical depth, the orientation of 
the side it starts on (lower horizontal or left vertical, for 
7r/2 < 9 < tt), and the position of its origin on that side. 
Using a terminology which will be particularly useful in 3D, 
the ray segments starting on the lower horizontal side of 
a cell we call the x-rays, and the ones starting on the left 
vertical side we call the y-rays. In Fig. the cell 1 has only 
one x-ray, whereas cell 21 has both an x- and a y-ray. Note 
that for our choice of angle 7r/2 < 9 < -k every cell on any 
level of refinement will have exactly one x-ray, and zero or 
one y-ray. Depending on geometry, these segments might be 
very short, e.g., the x-ray in the cell 4, but this ray count rule 
applies to all cells on all levels, and is central to our scheme. 
One can easily notice that any cell will have only the x-ray 
and no y-ray if the x-ray ends at the upper horizontal side 
of this cell, and an additional y-ray if the x-ray ends at the 
right vertical side. 

We discretize the rays entirely on the grid storing all in- 
formation in computer memory in the form cell : rayType : 
ray Attribute, and each cell is in turn an element in the grid 
tree hierarchy. Inside each cell we measure all lengths and 
coordinates in units of that cell's size, independently of its 
level of refinement. 

We now note the second important property of such ray 
construction. Since all rays are parallel and separated in x 
by the local grid resolution Aa; (which varies from region 
to region), the ray geometry inside each cell - number of 
segments, their lengths and orientation - is the same for 
all cells of a given refinement level within each horizontal 
layer. For example, in Fig.^cells 21, 22, 31 and 32 have the 
same ray geometry, and so do the cells 511, 512, 521, 522. 
Similarly, the cells 513, 514, 523 and 524 have their own 
geometry. It immediately follows that the ray geometry has 
to be computed only once per level, per layer. 

We start calculation by computing the ray geometry of 
cell 1 and recording it in pattern Pi. Next we sub-divide 
this pattern into two elements Pn and P12, the first one 
containing the ray geometry of cells 21, 22, 31 and 32, and 
the second one - ray geometry of cells 23, 24, 33 and 34. 
In the second base grid layer we have one additional level 



of refinement, and store the patterns P2, P21, P211, P212, 
and P22- In each cell we record a pointer to its pattern, 
although this association can also be recovered from position 
of that cell in the tree, with a few extra integer operations. 
In addition, we compute the optical depth along each ray 
clement. 

Before actual transport, we introduce the concept of x- 
and y-neighbours of each cell which are the cells into which 
the x- and y-ray "back-trace" into. Neighbours can be of 
the higher, lower, or same level of refinement as the current 
cell. Cell 1 in Fig. does not have any neighbours since its 
only x-ray starts on the grid boundary. Cell 21 has 1 as its 
y-neighbour, since its y-ray starts on the interface with cell 
1, but it has no x-neighbour. Cell 511 has one x-neighbour 
(23) and one y-neighbour (4). Cell 53 has one x-neighbour 
(514) and one y-neighbour (4), and so on. For each cell C we 
store its x- and y-neighbours as pointers C : X and C : y. 

Next we perform transport, starting from cell 1, contin- 
uing in cells 21, 22, 23, 24, 31, 32, 33, 34, 4, 511, 512, and 
so on until we cover the entire grid. In each cell we start by 
computing transfer of the intensity along the x-ray. The in- 
put intensity C : xray : Ji n is either the cosmic background 
intensity (cells 1, 21, 22, 31, 32, i.e. those along the lower 
edge of the grid), or the outgoing intensity of the respective 
x-neighbour, either C : X : xray or C : X : yray, depending 
on which ray segment hits the horizontal interface between 
C and C : X. Note that by construction only one ray seg- 
ment in C : X reaches this interface, so there is no ambiguity 
in the choice of the incoming intensity C : xray : I[ n . When 
we continue a ray from a lower-resolution into a higher- 
resolution cell, i.e. when C : level > C : X : level, we do 
not interpolate across the wavefront and instead just propa- 
gate the "un-refined intensity" C : X : xray : lout- We found 
that in practice the error associated with this approximation 
is negligible, since it is in the refined region that radiation 
starts to show a non-zero gradient across the wavefront on 
scales smaller than the size of C : X. 

Going back to our example, to get the input x-ray in- 
tensity in both the cells 511 and 512, we use the outgoing 
x-ray intensity from their x-neighbour, cell 23. In cell 53 
the incoming x-ray intensity is the outgoing y-ray inten- 
sity from cell 514 (the thick line in Fig. 0. After an x-ray 
update in each cell we proceed to a y-ray update in that 
cell if a y-ray segment is present. The incoming y-ray in- 
tensity is again either the cosmic intensity for border cells 
(cell 4), or the outgoing x-ray (only!) intensity of the cor- 
responding y-neighbour. Note that by construction a y-ray 
in any cell never reaches the cell's right vertical side, al- 
ways ending at the upper horizontal side. The only special 
case for a y-update which does not arise in x-updates is 
when C : y : xray ends on the upper horizontal side of 
C : y, instead of the right vertical side. This in fact hap- 
pens in cell 21 for which its y-neighbour's x-ray (cell 1) 
does not end on the right vertical side. In this case we use 
C : yray : L m = cell : y : xray : I out . 

2.2 3D unigrid implementation 

Now consider a 3D uniform Cartesian grid of size N x x N y x 
N z and a plane-parallel wavefront entering the volume in 
some direction <f>, 9. As in the 2D case, we limit the angles 
(j), 9 in such a way that a ray entering a cubic cell through 
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its lower left front corner (x — y = z — in Fig. |5J would 
leave this cell through its upper xy-face without hitting the 
side faces first. Such restriction on the angles means that 
for now (f>,9 can cover only a solid angle of 7r/6. We dis- 
cretize the wavefront with N x x N y long characteristics, 
each one of which starts in its own base grid cell at the 
bottom (z = 0) of the computational volume and crosses 
the entire grid until it leaves the volume. In analogy with 
the 2D scheme each long characteristic is divided into cell 
segments, but now each segment can start on a xy-plane, xz- 
plane, or yz-plane (Fig. |5J . We continue the notation from 
the previous section and call these segments the xy-, xz-, 
and yz-rays, respectively, and consider them to be elements 
of their parent cells. As before, Ci,j,k '■ xyray : xq stands for 
the x-coordinate of the starting point of the xy-ray inside 
cell Ci t j t k, and we assume that inside each cell all lengths 
are measured in units of that cell's size, so that the start- 
ing position of each long characteristic is described by the 
same pair of numbers Ci t j t k : xyray : xo,yo which do not 
depend on indices j, k, and we wrote i = 1 since all long 
characteristics start in the bottom layer cells. 

As long as 0, stay inside the solid angle we chose 
above, each base grid cell in the volume will have exactly 
one xy-ray in it, zero or one xz-ray, and zero or one yz-ray. 
Since the grid is Cartesian, and we have fixed the angles, 
all base grid cells in each layer i have exactly the same 
set of rays segments, which can be conveniently stored in 
a pattern Pi, to which each cell in that layer has a pointer 
Ci,j,k '■ P — * Pi- We have to compute and store this pattern 
only once per layer. The starting points Pj+i : xyray : xo, yo 
for each subsequent layer can be computed entirely from Pi. 

Without refinement one can compute radiative transfer 
one layer at a time, starting from the very bottom layer 
i = 1. In each layer, except for i = 1, the first step would be 
to compute a pair of starting points Pi : xyray : xo,yo from 
Pi-i, and then calculate the entire pattern Pi starting from 
the length of the xy-ray and the type of the plane it finishes 
on (xy, xz or yz). If the xy-ray ends on the xy-plane, then it 
is the only ray segment in this pattern. However, if it finishes 
on the xz-plane, there will be an xz-ray inside this pattern 
(Fig. |5J. If the xz-ray is present, it will end on either the 
xy- or yz-plane, but not on xz-plane, due to the restriction 
in angles we chose above. In the latter case we'll have one 
yz-ray in this pattern, and so on. For each additional ray 
segment we need to compute its starting point, either Pi : 
xzray : xo,zo or Pi : yzray : yo,zo, and the length of that 
ray segment. We also record, for quick reference, the type of 
the ray that hits each of the planes (xy, xz, yz) in that cell. 

Once the layer pattern is computed, we can use it for 
every cell in that layer. In each cell Cij,k we start by doing 
transfer along the xy-ray. The starting intensity is either 
the cosmic background intensity (i — 1), or the outgoing 
intensity of the ray segment (xy, xz, or yz) in Ci-ij,k that 
hits the upper xy-plane in that cell. An angle-dependent 
source function discretized on this ray segment can also be 
added to the transport step. Next, if the segment Cij t k '■ 
xzray is active, we perform corresponding transfer of the 
incoming intensity, which is taken to be either the cosmic 
background (j = 1), or the outgoing intensity of the ray 
segment in dj-^^ that ends on the xz-plane of that cell. 
A similar update is done for the segment Cij^k '■ yzray if it 
exists in this cell. 




Figure 1. Ray geometry in 2D with two levels of refinement. The 
arrow shows the photon travel direction. The lines on the right 
show the hierarchy of patterns which hold the ray geometry in 
the order these patterns are created, from left to right. The long 
rays on the base grid are drawn with thick lines, and dash-dotted 
lines indicate ray segments on the second (highest in this case) 
level of refinement. 



2.3 3D implementation with refinement 

Let us finally consider a fully threaded 3D data structure in 
which each cell Cij t k can be refined, in which case it con- 
tains eight subcells Ci t j t h : Cj/^fe/ of size 1/2 of the parent, 
and assume that this refinement can proceed to any arbi- 
trary level. Then the ray pattern Pi for each layer will also 
be recursively refined to form structures of the type Pi : P^ , 
up to the deepest level of cell refinement in that layer. Unlike 
in the unrefined case, each ray segment is not necessarily a 
simple continuation of corresponding ray segment in a neigh- 
bouring cell. If we just entered a refined region, new ray seg- 
ments are created, and there are several strategies one can 
use to refine intensities across the wavefront. In complete 
analogy with the 2D construction above, we choose not to 
interpolate the intensities, but just to use the outgoing in- 
tensity of the cell into which the refined ray "back-traces" . 
However, we need an extra step to find out which cells the 
xy-, xz- and yz-rays "back-trace" into, or what we call the 
xy-, xz- and yz-neighbours of the current cell. For each ray 
segment present, we store the pointers to their respective 
neighbours, which can be of the same, higher or lower level 
of refinement. Note that every cell containing, e.g., an xz-ray 
must have an xz-neighbour, unless it starts on the boundary 
of the computational volume. 

As in the unrefined scheme, before we perform transfer 
in a particular layer, we need to compute the base grid ray 
pattern for that layer. We then proceed to perform transfer 
in all base grid cells, following xy-rays and, if present, xz- 
rays and yz-rays, from their respective neighbours. When 
we hit a refined cell, we need to compute the corresponding 
refined ray pattern to which this cell has a pointer, if that 
pattern has not been computed before, and do that for each 
cell and refinement level in the layer. Each pattern at each 
level of refinement needs to be computed only once per layer. 

The radiative transfer calculation itself is very fast, 
since all we need to do is follow the interconnected data 
structures of cells and ray segments we have already created 
for a particular choice of angles <f>, 6. We always compute ra- 
diative transfer on the deepest level of refinement available, 
thus eliminating the need to store intensities at the bound- 
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Figure 2. Ray geometry in a 3D cell. There is always one xy-ray 
which starts at point A (lower xy-face), and for this particular 
orientation the cell also contains a yz-ray starting at point B (left 
yz-face), and an xz-ray starting at point C (front xz-face). 

ary between regions of different refinement levels. The only 
floating point operations we need are the pattern calcula- 
tion (0(N Z )), the calculation of the optical depth r in each 
cell (0(N X x N y x N z )), and the optical depth attenuation 
e~ T (0(N X x N y x N z )). If we include emission, we also 
need to compute the source function (0(N X x N y x N z )). 
There are one, two or three ray segments in each cell, 
the average number falling between one and two and de- 
pending on the grid size and hierarchy complexity. The to- 
tal number of operations per iteration or timestep is then 
0(N X x N y x N g x Nangies), plus some overhead for refine- 
ment. 

To generalize this scheme for any direction <f), 8, we just 
need to rotate the indices used to access the grid elements, 
both on the base grid and in refined cells, to cover all 24 
angular zones of 7r/6, three (xy, xz, or yz) for each of the 
eight octants in a cube. 

Fo r discretization in a ngles we use the HEALPix algo- 
rithm jGorski et alj feof^. dividing the entire sphere into 
12 x 4 n_1 equal area pixels, where n = 1,2, so that in 
computing the mean, angle-averaged intensity (which then 
goes into the ionizational balance module) all quadrature 
terms have the same weight. The angle-dependent intensity 
in each cell is obtained by averaging over all ray segments 
in a given direction present in that cell. We do not correct 
for the fact that the center of the ray segment does not co- 
incide with the center of the cell, since in our discretization 
the intensity is viewed more as a cell-averaged quantity. In 
addition, we find that averaging over few (one, two or three) 
ray segments cancels the errors faster than interpolation to 
the cell center. 

Finally, the directional intensity along each ray segment 
is computed as the average over that segment 

/ r \ ^in / -ft! j, ^in / 1 —kL\ *m ^out /1 \ 

^ = Tj Q e dl =^~ e ) = ln(/ m //o Ut ) ' (1) 

where L is the geometrical length, and 7i n and 7 ou t are 
the incoming and attenuated intensities at the ends of that 
segment. Using this form instead of the simple average 
(Jin + 7out)/2 ensures that we get a correct answer in cells 



which have large absorption but are nevertheless exposed to 
a strong radiation background. 



3 APPLICATION 

We ran a number o f stand ard tests of the type described in 
iRazoumov fc Scotd J1999T) to make sure that we recover the 
proper mean intensity on uniform grids. To test the method 
with cell refinement, we took an output from a cosmological 
simulation compute d with Enzo, an Euleria n AMR struc- 
ture formation code fervan fc Normanlll999l) . in the 8 Mpc 
(comoving) volume at z — 3 at 128 3 base grid resolution, 
assigned some constant cross-section a which is typical - 
within a factor of two - for photons above the Lyman limit 
propagating in the intergalactic medium, and ran a simple 
optical attenuation calculation with the variable number of 
angles 12 x 4 n_1 = 12,48, 192 and 768. The entire dataset 
included 501 nested grids with up to five levels of refinement, 
which for the purposes of our transfer we projected onto a 
fully threaded data structure. Full calculation - computing 
ray geometry, finding neighbours and doing actual transport 
- takes about 3 seconds in one direction on the full tree hier- 
archy on a 2 Ghz Pentium 4 processor. In Fig.[3]we plotted 
the angle- averaged intensity in a cross-section through one 
of the dense halos, in units of the mean cosmic background. 

At lower angular resolution the radiation field is ex- 
tremely clumpy forming long shadows behind shielded and 
semi-shielded regions. Visually, the quality of the solution 
improves drastically when we switch from 48 to 192 angles, 
and the latter seems to be an almost converged solution as 
there is very little difference in radiation fields computed at 
192 and 768 angles. We conclude that for a volume of several 
Mpc on a side at z = 3 one needs at least a hundred angular 
resolution elements to compute the diffuse component. In 
general, this number depends on the size of the volume, the 
redshift and the photon frequency. 

Fig.EJshows the mean intensity as a function of the local 
density using data from all cells on all levels of refinement, 
for a high angular resolution model. The vertical errorbars 
demonstrate a large r.m.s. mean intensity dispersion for all 
cells of the given density, indicating that radiative transfer 
effects are not negligible in modeling the thermal state of 
the intergalactic medium. 



4 DISCUSSION 

In our implementation, since transfer is done layer by layer, 
there is no need to store any transfer-related variables in 
more than two base grid layers at any given time, the cur- 
rent layer and the one below. Of course, one would prefer 
to keep grid variables such as density, temperature and ion- 
ization state of the entire volume in memory at once, oth- 
erwise these variables would need to be read from the disk 
every new timestep or iteration. Overall, if all cells are kept 
in memory, our implementation does not require any more 
storage than a hydrodynamical part. For parallel implemen- 
tation, if the entire cell hierarchy can fit on one processor, 
parallelization via angle decomposition is preferable to vol- 
ume decomposition. 
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Figure 3. Color map of the mean intensity in a slice through the 8 Mpc (comoving) volume, in units of the mean background intensity, 
at z = 3, for a 128 3 run with 5 levels of refinement by a factor of two. The upper left panel shows the calculation with 12 angles, the 
upper right - with 48 angles, and the bottom panels are with 192 and 768 angles, respectively (see text for details). Only 3 levels of 
refinement are plotted for simplicity. 




g(density) 



Figure 4. Mean intensity, in units of the mean background inten- 
sity, as a function of density, in units of the mean density in the 
box, grouped into 30 bins. The errorbars show r.m.s. dispersion 
in each bin. 



In a hydrodynamical structure formation code, radia- 
tive transfer is followed by calculation of ionizational bal- 
ance and gas temperature. We find that for large hydrody- 
namical time steps 10-20 iterations are normally required to 
compute an equilibrium position of HI, Hel and Hell ioniza- 
tion fronts. Interestingly enough, fewer iterations are needed 
to obtain a solution in higher spatial resolution models in 
which cosmological filaments and sheets are better resolved 
and tend to leave more open space for radiation to reach the 
higher density regions. 
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